Cross-validation and cross-study validation of chronic lymphocytic leukaemia with exome sequences and machine learning

Authors

  • Abdulrhman Aljouie
  • Nihir Patel
  • Bharati Jadhav
  • Usman Roshan
Abstract

The era of genomics brings the potential for better DNA-based risk prediction and treatment. We explore this problem for chronic lymphocytic leukaemia, for which one of the largest whole-exome data sets is available from the NIH dbGaP database. We apply a standard next-generation sequencing procedure to obtain single-nucleotide polymorphism (SNP) variants and reach a peak mean accuracy of 82% in our cross-validation study. We also cross-validate an Affymetrix 6.0 genome-wide association study of the same samples, where we find a peak accuracy of 57%. We then perform a cross-study validation in which exome samples from other studies in the NIH dbGaP database serve as the external data set. There we obtain an accuracy of 70% with the top Pearson-ranked SNPs obtained from the original exome data set. Our study shows that even with a small sample size we can obtain moderate to high accuracy with exome sequences, which is encouraging for future work.
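
The combination of Pearson ranking and cross-validation mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the synthetic genotype matrix, the nearest-centroid classifier, the fold count, the choice to rank SNPs inside each training fold, and every function name are assumptions introduced here. The abstract itself states only that SNPs were ranked by Pearson correlation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def top_snps(X, y, k):
    """Indices of the k SNP columns with the largest |r| against the case/control labels."""
    n_snps = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_snps)]
    return sorted(range(n_snps), key=lambda j: -scores[j])[:k]


def nearest_centroid_predict(X_train, y_train, X_test, cols):
    """Stand-in classifier (an assumption): pick the class with the closer centroid."""
    centroids = {}
    for label in set(y_train):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    preds = []
    for row in X_test:
        dist = {
            label: sum((row[j] - c[i]) ** 2 for i, j in enumerate(cols))
            for label, c in centroids.items()
        }
        preds.append(min(dist, key=dist.get))
    return preds


def cross_validate(X, y, k_snps, n_folds=5, seed=0):
    """Mean accuracy over folds; SNPs are ranked on the training part of each fold only."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        cols = top_snps(X_train, y_train, k_snps)
        preds = nearest_centroid_predict(X_train, y_train, [X[i] for i in fold], cols)
        correct = sum(p == y[i] for p, i in zip(preds, fold))
        accuracies.append(correct / len(fold))
    return sum(accuracies) / len(accuracies)


def make_demo_data(n_per_class=20, n_snps=20, seed=0):
    """Synthetic genotypes coded 0/1/2; SNP 0 is made fully informative on purpose."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.randint(0, 2) for _ in range(n_snps)]
            row[0] = 2 * label  # controls carry genotype 0, cases genotype 2
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_demo_data()
    print("top SNP:", top_snps(X, y, 1))
    print("mean CV accuracy:", cross_validate(X, y, k_snps=1))
```

Ranking inside each training fold keeps the held-out samples from influencing feature selection. In a cross-study setting the same idea would apply with the original study as the training set and the external study as the test set.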

Similar resources

Cross-validation and cross-study validation of kidney cancer with machine learning and whole exome sequences from the National Cancer Institute

Accurate cancer risk prediction from genetic and environment variables is a key problem in medicine. One approach is to use somatic mutations which could potentially be used in early detection and prevention. SNP based studies are the most common ones utilizing this approach, however most studies lack a cross-study validation component across at least two independent studies. Here we explore th...

Modeling Discharge Coefficient of Side Weir on Converging Channel Using Extreme Learning Machine

In this study, the discharge coefficient of side weirs located on converging channels was simulated for the first time using a new method of Extreme Learning Machine (ELM). To examine the accuracy of the numerical model, the Monte Carlo simulations were used and the experimental values validation was conducted by the k-fold cross validation method. Then, the input parameters were detected for s...

Determining the progression stages of liver fibrosis in patients with chronic hepatitis B

Introduction: Chronic hepatitis B (CHB) leads to liver fibrosis, its failure, and death in the long term. The stage of fibrosis in CHB patients can also be detected based on the biochemical markers. The aim of this study was to predict the state of liver fibrosis in CHB patients and determine the possibility of patients shifting from a given state to another one. Materials and Methods: This stu...

Machine Learning Models for Housing Prices Forecasting using Registration Data

This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...

Automatic road crack detection and classification using image processing techniques, machine learning and integrated models in urban areas: A novel image binarization technique

The quality of the road pavement has always been one of the major concerns for governments around the world. Cracks in the asphalt are one of the most common road tensions that generally threaten the safety of roads and highways. In recent years, automated inspection methods such as image and video processing have been considered due to the high cost and error of manual metho...

Journal title:
  • IJDMB

Volume 16  Issue 

Pages  -

Publication date 2016